Abstract

Many modern network designs incorporate "failover" paths into routers' forwarding tables. 
We initiate the theoretical study of the conditions under which such resilient routing tables can 
guarantee delivery of packets. 

1 Introduction 

The core mission of computer networks is delivering packets from one point to another. To ac- 
complish this, the typical network architecture uses a set of forwarding tables (that dictate the 
outgoing link at each router for each packet) and a routing algorithm that establishes those for- 
warding tables, recomputing them as needed in response to link failures or other topology changes. 
While this approach provides the ability to recover from an arbitrary set of failures, it does not 
provide sufficient resiliency to failures because these routing algorithms take substantial time to 
reconverge after each link failure. As a result, for periods of time ranging from tens of milliseconds
to seconds (depending on the network), the network may not be able to deliver packets to certain
destinations. In comparison, packet forwarding is several orders of magnitude faster: a 10 Gbps
link, for example, sends a 1500 byte packet in 1.2 μsec.
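The gap between the two timescales can be checked with a few lines of arithmetic. The sketch below recomputes the per-packet transmission time for the 1500-byte packet and 10 Gbps link mentioned above; the two convergence times (10 ms and 1 s) are illustrative picks from the quoted range, not measurements.

```python
# Transmission (serialization) time of one packet on a link.
PACKET_BYTES = 1500        # packet size from the text
LINK_BITS_PER_SEC = 10e9   # 10 Gbps link from the text


def transmission_time_sec(packet_bytes: int, link_bps: float) -> float:
    """Time needed to place all bits of the packet onto the link."""
    return packet_bytes * 8 / link_bps


t = transmission_time_sec(PACKET_BYTES, LINK_BITS_PER_SEC)
print(f"{t * 1e6:.1f} microseconds per packet")

# Assumed convergence times, chosen from the range quoted in the text.
for convergence_sec in (0.010, 1.0):
    packets = convergence_sec / t
    print(f"{convergence_sec} s of reconvergence ~ {packets:,.0f} packet times")
```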

In order to provide higher availability we must design networks that are more resilient to failures. 
To this end, many modern network designs incorporate various forms of "backup" or "failover" 
paths into the forwarding tables that enable a router (or switch), when it detects that one of its 
attached links is down, to use an alternate outgoing link. We call these resilient routing tables since 
they embed failover information into the routing table itself and do not entail changes in packet 
headers (and so require no change in the low-level packet forwarding hardware). Because these
failover decisions are purely local — based only on the packet's destination, the packet's incoming 
link, and the set of active incident links — they occur much more rapidly than the global recovery 
algorithms used in traditional routing protocols and thus result in many fewer packet losses. 

While such resilient routing tables are widely used in practice (e.g., ECMP), there has been 
little theoretical work on their inherent power and limitations. In this paper, we prove that starting 
with arbitrary loop-free routing tables, we can add forwarding rules to provide resilience against 
single failures in all scenarios (so long as the network remains topologically connected). We show, 
in contrast, that perfect resilience is not achievable in general (i.e., there are cases in which no 
set of routing tables can guarantee packet delivery even when the graph remains connected). We
leave open the question of closing the large gap between our positive and negative results. Other 
interesting open questions include exploring resilient routing tables in the context of specific families 
of graphs, randomized forwarding rules, and more. 

The prior work closest to ours is Failure Insensitive Routing (FIR) [6]. FIR is also able to 
guarantee resilience to a single link failure, but is restricted to starting with shortest path routing 
tables. Our result on resilience to a single failure is more general, allowing the use of arbitrary 
(loop-free) routing tables in the absence of failure, and adding rules for tolerating one failure. In
addition, we also demonstrate the impossibility of perfect resilience. FIR does not discuss a negative 
result of this nature. 



While there is other significant past research on how to make routing more resilient, these efforts 
differ from our discussion here in one or more important respects. For instance, the literature 
discusses approaches that: (a) use bits in the packet headers to determine when to switch from 
primary to backup paths (this includes MPLS Fast Reroute) [HHIE]; (b) encode failure information 
in packet headers to allow nodes to make failure-aware forwarding decisions [3, 12] (work on
fault-tolerant compact routing [TU] also fits in this category); and (c) use graph-specific properties to
achieve resilience (3 . Our own recent work [7] provides full resilience (i.e., guaranteed packet
delivery as long as the network remains connected), but modifies routing tables on the fly.

2 Model 

The network is modeled as an undirected graph G = (V,E), in which the vertex set consists of 
source nodes $\{1, 2, \ldots, n\}$ and a unique destination node $d \notin [n]$. Each node $i \in [n]$ has a forwarding
function $f_i^d : E_i \times 2^{E_i} \to E_i$, where $E_i$ is the set of node i's incident edges. $f_i^d$ maps incoming edges
to outgoing edges as a function of which incident edges are up. We call an n-tuple of forwarding
functions $f^d = (f_1^d, \ldots, f_n^d)$ a forwarding pattern.

Consider the scenario that a set of edges $F \subseteq E$ fails. A forwarding path in this scenario is a
route in the graph $H^F = (V, E \setminus F)$ such that for every two consecutive edges $e_1, e_2$ on the route
which share a mutual node i it holds that $f_i^d(e_1, E_i \setminus F) = e_2$.

Intuitively, our aim is to guarantee that whenever a node is connected to the destination d, 
it also has a forwarding path to the destination. Formally, we say that a forwarding pattern f
is t-resilient if for every failure scenario $F \subseteq E$ such that $|F| \leq t$, (1) if there exists some route
from a node i to d in $H^F$ then there also exists a forwarding path from i to d in $H^F$; and (2)
all forwarding paths in $H^F$ are loop-free. (Observe that the combination of these two conditions
implies, intuitively, that a packet never enters a loop en route to the destination or, alternatively,
"gets stuck" at an intermediate node.)

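The model can be made concrete with a small simulator. The following is a minimal sketch, not the paper's code: it assumes edges are identified by the neighbor at their far end, represents each forwarding function as a hypothetical Python callable `rule(prev, live)`, and uses an invented three-node example with destination 0. Because forwarding is deterministic, a repeated directed edge means the packet loops forever.

```python
def edge(u, v):
    """Undirected edge between u and v."""
    return frozenset((u, v))


def live_neighbors(adj, failed, i):
    """Neighbors of i reachable over edges that have not failed."""
    return frozenset(v for v in adj[i] if edge(i, v) not in failed)


def follow(adj, rules, failed, src, first_hop, dest):
    """Follow the forwarding functions, starting with the hop src -> first_hop.

    Returns (status, path) with status 'delivered', 'loop', or 'stuck'.
    """
    path = [src, first_hop]
    seen = {(src, first_hop)}          # directed edges already traversed
    prev, cur = src, first_hop
    while cur != dest:
        nxt = rules[cur](prev, live_neighbors(adj, failed, cur))
        if nxt is None:
            return "stuck", path
        path.append(nxt)
        if (cur, nxt) in seen:         # same directed edge twice: a loop
            return "loop", path
        seen.add((cur, nxt))
        prev, cur = cur, nxt
    return "delivered", path


D = 0                                   # destination (hypothetical example)
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}


def prefer_destination(other):
    """Rule: use the direct edge to D if it is up, else the edge to `other`."""
    def rule(prev, live):
        if D in live:
            return D
        return other if other in live else None
    return rule


rules = {1: prefer_destination(2), 2: prefer_destination(1)}

result_one = follow(adj, rules, {edge(1, 0)}, 2, 1, D)
result_two = follow(adj, rules, {edge(1, 0), edge(2, 0)}, 2, 1, D)
print(result_one)
print(result_two)
```

In the second scenario the destination is disconnected, so the loop does not violate condition (1); it only shows how the simulator reports a repeated directed edge.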


3 Positive Result 
3.1 High-Level Overview 

We now present our main result, which establishes that for every given network it is possible to 
efficiently compute a 1-resilient forwarding pattern. 

Theorem 3.1. For every network there exists a 1-resilient forwarding pattern and, moreover, such 
a forwarding pattern can be computed in polynomial time. 



We prove Theorem 3.1 constructively; we present an algorithm that efficiently computes a 1- 
resilient forwarding pattern. We now give an intuitive exposition of our algorithm. We first orient 
the edges in G so as to compute a directed acyclic graph (DAG) D in which each edge in E is 
utilized. Our results hold regardless of how the DAG D is computed. An example network and 
corresponding DAG appear in figures 1(a) and 1(b), respectively. The DAG D naturally induces 
forwarding rules at source nodes; each node's incoming edge in D is mapped to its first active 
outgoing edge in D, given some arbitrary order over the node's outgoing edges (e.g., node 4 in the 
figure forwards traffic from node 5 to node 2 if the edge to 2 is up, and to node 3 otherwise). 
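As a rough illustration of this first phase, the sketch below orients every edge toward the endpoint with the smaller (BFS distance to d, node id) pair and then applies the "first active outgoing edge" rule. The tie-breaking by node id and the four-node example graph are our assumptions; the text only requires that some DAG using every edge be chosen.

```python
from collections import deque


def bfs_dist(adj, d):
    """Hop distance of every node from the destination d."""
    dist = {d: 0}
    queue = deque([d])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def orient(adj, d):
    """Return out[i]: ordered list of i's outgoing neighbors in the DAG.

    Each edge points toward the endpoint with the smaller (distance, id)
    rank, so ranks strictly decrease along every edge and no cycle can form.
    """
    dist = bfs_dist(adj, d)

    def rank(v):
        return (dist[v], v)

    return {i: sorted((v for v in adj[i] if rank(v) < rank(i)), key=rank)
            for i in adj}


def dag_next_hop(out, i, live):
    """First live outgoing edge of i in the DAG, or None if all are down."""
    for v in out[i]:
        if v in live:
            return v
    return None


# Hypothetical example network with destination 0.
adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2]}
out = orient(adj, 0)
print(out)
print(dag_next_hop(out, 3, {1, 2}), dag_next_hop(out, 3, {2}))
```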

Figure 1: Illustration of high-level idea. (a) Example network. (b) Corresponding DAG.

Intuitively, the next step is to identify a "problematic" node, that is, a node that is bi-connected
to the destination in G but not in the partial forwarding pattern computed thus far, and add
forwarding rules so as to "fix" this situation. Once this is achieved, another problematic node is
identified and fixed, and so on. Observe that nodes 1-4 in the figure are all problematic. Observe
also that adding the two following forwarding rules fixes node 4 (i.e., makes node 4 bi-connected
to the destination in the forwarding pattern): (a) when both of node 4's outgoing edges in D are
down, traffic reaching 4 from node 5 is sent back to 5; and (b) when node 5's direct edge to the
destination is up, traffic reaching node 5 from node 4 is sent along this edge. Thus, the algorithm
builds the forwarding functions at nodes gradually, as more and more forwarding rules are added
to better the resilience of the forwarding pattern.
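The graph half of the "problematic node" test, whether a node has two edge-disjoint routes to d in G, can be sketched as follows. By Menger's theorem, in a simple graph this holds exactly when the node stays connected to d after the removal of any single edge. The function names and the example graph are hypothetical, and the check against the partial forwarding pattern is not covered here.

```python
def reachable(adj, src, dst, removed=None):
    """Depth-first search from src to dst, ignoring the edge `removed`."""
    stack, seen = [src], {src}
    while stack:
        u = stack.pop()
        if u == dst:
            return True
        for v in adj[u]:
            if frozenset((u, v)) != removed and v not in seen:
                seen.add(v)
                stack.append(v)
    return False


def biconnected_to(adj, i, d):
    """True iff i has two edge-disjoint routes to d (simple graph assumed)."""
    edges = {frozenset((u, v)) for u in adj for v in adj[u]}
    return reachable(adj, i, d) and all(
        reachable(adj, i, d, removed=e) for e in edges
    )


# Hypothetical example: node 3 hangs off node 2 by a single edge.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
print({i: biconnected_to(adj, i, 0) for i in (1, 2, 3)})
```

This brute-force check runs one graph search per edge, which is enough for small examples; a max-flow computation would be the usual choice for larger graphs.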

Implementing the above approach, though, requires care; the order in which problematic nodes 
are chosen, and the exact manner in which forwarding rules are fixed, are important. Intuitively, 
our algorithm goes over problematic nodes in the topological order $<_D$ induced by the DAG D
(visiting problematic nodes closer to the destination in D first), and when fixing a problematic
node i, forwarding rules are added until a minimal node in $<_D$ whose entire sub-DAG in D does
not traverse i is reached. We prove that this scheme outputs the desired forwarding pattern in a 
computationally-efficient manner. 

3.2 Algorithm and Correctness 
3.2.1 Algorithm 

1. Initialize. $\forall e = (j, i) \in E$, $\forall T \subseteq E_i$, set $f_i^d(e, T) := \emptyset$.

2. Construct DAG. Construct a DAG $D = (V, E_D)$ (e.g., using BFS/DFS) that is rooted in
d and such that $\forall (i, j) \in E$, $(i, j) \in E_D$ or $(j, i) \in E_D$. D induces the following partial order
$<_D$ over V: $\forall i, j \in V$, $i <_D j$ iff there is a route from j to i in D.

3. Install DAG-based forwarding rules. $\forall i \in V$, let $E_D^i$ denote the set of i's outgoing
edges in D. Choose an order over every $E_D^i$ in some arbitrary manner. $\forall j \in V$ such that
$e = (j, i) \in E$ and $\forall T \subseteq E_i$ such that $T \cap E_D^i \neq E_D^i$ (that is, at least one edge of $E_D^i$ is
not in T), set $f_i^d(e, T)$ to be the highest element in $E_D^i$ that is not in T.

4. Install additional forwarding rules. While there exists a node q that is bi-connected to
d in G but not in $f^d = (f_1^d, \ldots, f_n^d)$ (that is, for which there do not yet exist at least two
edge-disjoint forwarding paths to the destination in $f^d$) do:

(a) Choose i to be a minimal node (under $<_D$) that is bi-connected to d in G but not in
$f^d = (f_1^d, \ldots, f_n^d)$.

(b) Choose j to be a minimal node (under $<_D$) such that (1) $i <_D j$ and (2) $\exists x \in V$ such
that $(j, x) \in D$ and $i \not<_D x$.

(c) Choose a simple route $R = (j = v_1, v_2, \ldots, v_k = i)$ from j to i in D.

(d) Set $c := k - 1$.

(e) While ($c > 1$) and ($f_{v_c}^d(v_{c+1}, v_c) = \emptyset$) do:

• $f_{v_c}^d(v_{c+1}, v_c) := (v_c, v_{c-1})$

• $c := c - 1$

(f) If $c = 1$, then $f_j^d(v_2, v_1) := (j, x)$.
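Steps 4(c) through 4(f) walk backwards along the route R from i toward j, installing "send it back" rules until either j is reached or an already-installed entry is found. The sketch below is one reading of those steps and rests on stated assumptions: rules are stored in a hypothetical dictionary keyed by (node, neighbor the packet came from), the failure-set argument of the forwarding functions is omitted, and the node labels are invented to mirror the two-iteration scenario discussed in the proof.

```python
def install_detour(rules, route, x):
    """Install backward rules along route = [v1 = j, ..., vk = i].

    rules[(node, came_from)] = next hop. Returns the final value of c.
    """
    v = [None] + list(route)           # 1-based indexing, as in the text
    k = len(route)
    c = k - 1
    # Walk back from v_{k-1} while no rule exists for traffic from v_{c+1}.
    while c > 1 and (v[c], v[c + 1]) not in rules:
        rules[(v[c], v[c + 1])] = v[c - 1]
        c -= 1
    if c == 1:
        # At j, traffic arriving from v_2 leaves the route toward x.
        rules[(v[1], v[2])] = x
    return c


def trace(rules, src, first_hop):
    """Follow installed rules from the hop src -> first_hop until none applies."""
    path, prev, cur = [src, first_hop], src, first_hop
    while (cur, prev) in rules:
        prev, cur = cur, rules[(cur, prev)]
        path.append(cur)
    return path


rules = {}
c1 = install_detour(rules, ["j1", "a", "b", "i1"], "x")        # earlier iteration
c2 = install_detour(rules, ["j2", "a", "b", "g", "i2"], "x2")  # later iteration
print(c1, c2)
print(trace(rules, "i2", "g"))
```

In the second call the walk stops at node a because the first call already installed a rule there, so traffic from i2 is funneled onto the earlier detour through j1.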
3.2.2 Proof of Theorem 3.1

We now show that the algorithm outputs a forwarding pattern $f^d$ as in the statement of Theorem 3.1.
Consider a node i chosen in Step 4(a) of the algorithm. The following claim shows that a node j
as required in Step 4(b) always exists.



Claim 3.2. For every node i that is bi-connected to d in G but not in $f^d$ there exists a node j such
that (1) $i <_D j$; and (2) j has a directed edge in D to some node x such that $i \not<_D x$.

Proof. D spans all nodes in G and so there must exist a route $R_1$ from i to d in D. Node i is bi-connected
to d in G and so there must also exist another route $R_2$ that is edge-disjoint from $R_1$ and is not in
D (otherwise i would be bi-connected to d in D). Let j be a node on $R_2$ that has a route $R_3$ to
d in D that does not go through i. We can now go over the nodes in $R_3$ (from j to d) one by one
until we reach a node as in the statement of the claim. □

Consider an iteration of Step 4 of the algorithm. Recall that the node i chosen at that iteration
is a node that (at that point in time) is bi-connected to d in G but not in $f^d$, and node j is a
minimal node such that $i <_D j$ and that has a child x in D for which $i \not<_D x$.

We now show that following the execution of Step 4 the chosen node i becomes bi-connected to
d in $f^d$ and thus ceases to be "problematic". We handle two cases.

• Case I: In the execution of Step 4, c is decreased until c = 1. Observe that in this case i (that
already has a route to d in D) has (at the end of that iteration) two edge-disjoint forwarding
paths to d in $f^d$.

• Case II: c is decreased until a non-empty "entry" in $f^d$ is reached. We now show that in this
case, too, i has two edge-disjoint forwarding paths to d in $f^d$ at the end of that iteration.

We now handle Case II above. For ease of exposition we illustrate our arguments on the specific
(sub)network described in Figure 2. Recall that in Step 2 of the algorithm we construct a DAG D.
The nodes and the red directed edges in the figure are some subgraph of D (the destination node
d does not appear in the figure). Let $i_1$ and $j_1$ be the nodes i and j, respectively, chosen at some
iteration $q_1$ of Step 4 of the algorithm, and let $R_1 = (j_1, \alpha, \beta, i_1)$ be the route R selected at iteration
$q_1$. The blue directed edges in Figure 2 represent the changes to the forwarding functions made in
the $q_1$'th iteration (along the route $R_1$). Let $i_2$ and $j_2$ be the nodes i and j, respectively, selected
at some later iteration $q_2 > q_1$ of Step 4, and let $R_2 = (j_2, \alpha, \beta, \gamma, i_2)$ be the route R selected at
iteration $q_2$.

Figure 2: Illustration of proof idea

Now, suppose that at the end of iteration $q_1$ node $i_1$ is not only bi-connected to d in G but also
in $f^d$. We now show that at the end of the $q_2$'th iteration, $i_2$ too shall be bi-connected to d in both
G and $f^d$. Consider the $q_2$'th iteration of Step 4. Observe that at the $q_2$'th iteration c is decreased
until it reaches the node $\alpha$ as, at that point, a non-empty entry in the forwarding function is
reached. Hence, after the $q_2$'th iteration the route $(i_2, \gamma, \beta, \alpha, j_1, x)$ exists in the network. We now
show that $i_2 \not<_D x$ and so there exists a route from $i_2$ to d that does not intersect its routes to d
in D.

By contradiction. Suppose that $i_2 <_D x$. Recall that $j_1$ was chosen at iteration $q_1$ because it
was a minimal node such that $i_1 <_D j_1$ and has a child x in D such that $i_1 \not<_D x$. Hence, it must
be that $i_1 <_D \gamma$ because otherwise $\beta$ would have been chosen instead of $j_1$. Similarly, $i_1 <_D i_2$
because otherwise $\gamma$ would have been chosen instead of $j_1$. This, combined with our assumption
that $i_2 <_D x$, implies that $i_1 <_D x$, a contradiction. The proof of the theorem follows.

4 Negative Result 

We say that a forwarding pattern f is perfectly resilient if it is $\infty$-resilient, so that regardless of
the failure scenario $F \subseteq E$, if there exists some route from a node i to the destination d in $H^F$ then
there also exists a forwarding path from i to d in $H^F$. To prove that forwarding patterns cannot
always achieve perfect resilience, we first prove two properties of perfectly resilient forwarding
patterns.

Figure 3: A failure scenario where perfect resilience is impossible.



Lemma 4.1. For any edge $e_{uv}$, if v has any working path to the destination which does not use
the edge $e_{vu}$, then v must not send a packet traveling $u \to v$ back to u.

Proof. Assume the contrary, i.e., there is a perfectly resilient forwarding pattern f with $f_v^d(e_{uv}, E_v) = e_{vu}$
and $\exists e_{vw} \in E_v$, $w \neq u$, such that w has a working path to d. Now, consider a scenario where all
edges at u other than $e_{uv}$ fail while v is connected to d through $e_{vw}$. A packet from u must be sent
to v along $e_{uv}$. Then $f_v^d(e_{uv}, E_v) = e_{vu}$ implies v sends the packet back to u. Node u, having no other live
edges, sends it back to v, and we have a forwarding loop, even though there is a route to d. This
contradicts the claim of f being perfectly resilient. □

Lemma 4.2. A node i in the destination's connected component must route in some cyclic ordering
of $E_i \setminus F$, i.e., an ordering of its edges with its neighbors $v_1, \ldots, v_m$ such that $\forall j < m$: $f_i(v_j, E_i \setminus F) = v_{j+1}$
and $f_i(v_m, E_i \setminus F) = v_1$. For example, in figure ??, node 1 may route packets from 2 to 3,
packets from 3 to 4, from 4 to 5, and from 5 to 2.

Proof. Let nbrs(i) be the set of neighbors of node i. Assume the lemma is false, i.e., there is a
perfectly resilient forwarding pattern f such that $f_i$ does not use such a cyclic ordering over nbrs(i).
Then $f_i$ must have a smaller cyclic ordering which skips some neighbors $S \subset nbrs(i)$. Consider a
scenario where $u \in S$ has a route to d, but all edges from nodes in $nbrs(i) \setminus S$ have failed, except
those to i. The cyclic ordering in f over $nbrs(i) \setminus S$ ensures that packets loop over these nodes:
packets starting at any node in $nbrs(i) \setminus S$ are sent to i which forwards them to some other node in
the set (per the cyclic ordering). Any such node has no other connectivity except i, so the process
repeats ad infinitum. However, each node in $nbrs(i) \setminus S$ does have a route to d through u. This
contradicts the claim of f being perfectly resilient. □
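The condition in Lemma 4.2 amounts to saying that, for a fixed set of live edges, the map "came from neighbor v, forward to neighbor w" at node i is a single cycle through all live neighbors. A minimal sketch of that check follows; the dictionary representation and the function name are our own, and the first example encodes the ordering 2, 3, 4, 5, 2 mentioned in the lemma.

```python
def is_single_cyclic_ordering(next_hop):
    """next_hop maps each live neighbor (where the packet came from) to the
    neighbor it is forwarded to. True iff following next_hop from any
    neighbor visits every neighbor exactly once before returning."""
    if not next_hop:
        return False
    start = next(iter(next_hop))
    seen, cur = set(), start
    while cur not in seen:
        if cur not in next_hop:        # forwarded to an unknown neighbor
            return False
        seen.add(cur)
        cur = next_hop[cur]
    return cur == start and len(seen) == len(next_hop)


full_cycle = {2: 3, 3: 4, 4: 5, 5: 2}     # ordering from the lemma's example
two_cycles = {2: 3, 3: 2, 4: 5, 5: 4}     # skips neighbors: two short cycles
print(is_single_cyclic_ordering(full_cycle))
print(is_single_cyclic_ordering(two_cycles))
```

The second map is the kind of "smaller cyclic ordering" that the proof rules out: packets arriving from 2 or 3 never reach 4 or 5.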

Theorem 4.3. There exists a network for which no perfectly resilient forwarding pattern exists. 

Proof. Consider the example network in figure (c). We show that after certain failures, no
forwarding pattern on the original graph allows each surviving node in the destination's connected
component to reach the destination. In figure (c), the surviving links are shown in bold; all other
links fail.

By Lemma 4.2 above, node 1 has to route packets in some cyclic ordering of its neighbors. By
the topology's symmetry, we can suppose w.l.o.g. that this ordering is 2, 3, 4, 5, 2, i.e., $f^d$ is defined
such that 1 forwards packets from 2 to 3, packets from 3 to 4, etc. Note that a forwarding loop is
formed when a packet repeats a directed edge in its path (rather than just a node). To show that
this occurs, consider the path taken by packets sent by 5 after the failures. By Lemma 4.1 packets
sent $1 \to 2$ must not loop back, and so must travel $2 \to 10 \to 4 \to 1$. As a result the packet travels
$5 \to 1 \to 2 \to 10 \to 4 \to 1 \to 5 \to 1$ which is a loop since the edge $5 \to 1$ is repeated. □
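The distinction between revisiting a node and repeating a directed edge can be checked mechanically on the walk stated in the proof. The helper below is a small sketch (its name is ours); it scans the hop sequence 5, 1, 2, 10, 4, 1, 5, 1 and reports the first directed edge that occurs twice.

```python
def first_repeated_edge(hops):
    """Return the first directed edge (u, v) traversed twice, or None."""
    seen = set()
    for e in zip(hops, hops[1:]):
        if e in seen:
            return e
        seen.add(e)
    return None


walk = [5, 1, 2, 10, 4, 1, 5, 1]       # packet walk from the proof
print(first_repeated_edge(walk))

# Node 1 is visited twice in this prefix, yet no directed edge repeats,
# so the prefix alone is not a forwarding loop.
prefix = [5, 1, 2, 10, 4, 1]
print(first_repeated_edge(prefix))
```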